04. Collecting your own Data
Now that you have your idea for a robotic inference project, you need to collect some data to train it. The data is just as important as the network! You need both the right data and a properly tuned network to achieve great results.
**Important Note:** There is a 3 GB limit on your Workspaces home directory, `/home/workspace/`. Please make sure that you do not exceed this limit, as you may get locked out of your workspace. You can check your directory size at any time by running the following command:

```
$ du -sh /home/workspace
```
Classification Network
If you want to create a classification network, at least 400 images per class is a recommended starting point, though the right number varies from project to project. It will be up to you to determine the right number of samples for yours. Your network may learn one class well but struggle with another; in that case, collect more data for the weaker class. Also, remember to collect images in the same environment in which you will be conducting your inference.
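As a quick check on class balance, a short helper like the one below can count the images in each class folder. The `Data/<class>` directory layout and the file extensions here are assumptions to match the capture script later in this lesson; adjust them to your setup:

```python
import os

def class_counts(data_root):
    """Count image files in each class subdirectory of data_root."""
    exts = ('.png', '.jpg', '.jpeg')
    counts = {}
    for class_name in sorted(os.listdir(data_root)):
        class_dir = os.path.join(data_root, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = sum(
                1 for f in os.listdir(class_dir) if f.lower().endswith(exts))
    return counts

# Example usage (assumes a Data/ folder with one subdirectory per class):
# for name, n in class_counts('Data').items():
#     print(name, n)
```

A large gap between classes is a sign that you should go collect more images for the underrepresented one.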
Detection Network
If you decide to work with a detection network, you will need to annotate your data before uploading it to DIGITS: you must draw bounding boxes around the objects you want your network to learn. Many software applications can expedite this process, and a number of image-annotation tools can be found with an online search. Choose the one you think you will be most proficient with.
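DIGITS' object-detection workflow (DetectNet) expects annotations in the KITTI text format: one label file per image, one line per object, with the class name, the 2D bounding box in pixel coordinates, and the remaining 3D fields zeroed out. As a sketch, a small helper like this (the `kitti_label` name is ours, not part of DIGITS) can turn a box exported from your annotation tool into one such line:

```python
def kitti_label(class_name, left, top, right, bottom):
    """Return one KITTI-format label line for a 2D bounding box.

    Only the class name and the box corners matter for 2D detection;
    the truncation, occlusion, alpha, 3D dimension/location and
    rotation fields are all zeroed.
    """
    return ('{} 0.0 0 0.0 {:.2f} {:.2f} {:.2f} {:.2f} '
            '0.0 0.0 0.0 0.0 0.0 0.0 0.0').format(
                class_name, left, top, right, bottom)

# One line per object goes into <image_name>.txt alongside the images:
print(kitti_label('Cat', 12, 30, 140, 180))
```

Check your annotation tool's export options first; many can write KITTI labels directly, in which case no conversion is needed.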
Collecting the images
There are a number of ways to collect images. You can use a webcam and a Python or C++ script to collect data, for example. Some people have used phones to collect data. If you have a Jetson, it can be used to collect data as well.
We will provide a basic Python script for collecting images from your webcam, but you are encouraged to explore other methods as well.
If your upload speed is limited and your data set is large, it is advisable to upload your data to Google Drive or other fast cloud storage first. Once it is there, you can download your data into your instance in a very short amount of time.
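Uploading one archive is usually faster and less error-prone than uploading thousands of small files. A minimal sketch using only the Python standard library (the directory name is an assumption):

```python
import shutil

def archive_dataset(data_dir):
    """Zip the contents of a dataset directory into <data_dir>.zip
    so it can be uploaded to cloud storage as a single file."""
    return shutil.make_archive(data_dir, 'zip', root_dir=data_dir)

# Example usage:
# archive_path = archive_dataset('Data')
# print('Wrote ' + archive_path)
```

Remember the 3 GB workspace limit noted above when you later download and extract the archive.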
Example Python Data Capture Script
Note: Before running this script, you have to set up a proper Python 2.7 environment with OpenCV (`cv2`) installed:

```
conda install -c conda-forge opencv=2.4
```
```python
import cv2

# Run this script from the same directory as your Data folder

# Grab your webcam on local machine
cap = cv2.VideoCapture(0)

# Give image a name type
name_type = 'Small_cat'

# Initialize photo count
number = 0

# Specify the name of the directory that has been premade and be sure that it's the name of your class
# Remember: this directory name serves as your data's label for that particular class
set_dir = 'Cat'

print("Photo capture enabled! Press esc to take photos!")

while True:
    # Read in single frame from webcam
    ret, frame = cap.read()

    # Use this line locally to display the current frame
    cv2.imshow('Color Picture', frame)

    # Use esc to take photos when you're ready
    if cv2.waitKey(1) & 0xFF == 27:
        # If you want them gray
        # gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # If you want to resize the image
        # gray_resize = cv2.resize(gray, (360, 360), interpolation=cv2.INTER_NEAREST)

        # Save the image
        cv2.imwrite('Data/' + set_dir + '/' + name_type + "_" + str(number) + ".png", frame)
        print("Saving image number: " + str(number))
        number += 1

    # Press q to quit the program
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```
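Once a capture session finishes, it is worth sanity-checking the saved files before uploading. The sketch below reads the width and height directly from a PNG's IHDR header using only the standard library (no OpenCV required), so you can confirm every image saved at the size you expected:

```python
import struct

def png_size(path):
    """Return (width, height) read from a PNG file's IHDR chunk."""
    with open(path, 'rb') as f:
        header = f.read(24)
    if header[:8] != b'\x89PNG\r\n\x1a\n':
        raise ValueError('not a PNG file: ' + path)
    # Bytes 16-24 of a PNG are the big-endian width and height
    width, height = struct.unpack('>II', header[16:24])
    return width, height

# Example usage over a capture directory (path is an assumption):
# import glob
# for path in glob.glob('Data/Cat/*.png'):
#     print(path, png_size(path))
```

Any file that raises or reports an unexpected size can be deleted and recaptured before it pollutes your training set.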
Example C++ Script for the Jetson
```cpp
/*
Compile with:
gcc -std=c++11 Camera_Grab.cpp -o picture_grabber -L/usr/lib -lstdc++ -lopencv_core -lopencv_highgui -lopencv_videoio -lopencv_imgproc -lopencv_imgcodecs

Requires recompiling OpenCV with the GStreamer plug-in enabled. See: https://github.com/jetsonhacks/buildOpenCVTX2

Credit to Peter Moran for the base code.
http://petermoran.org/csi-cameras-on-tx2/
*/
#include <opencv2/opencv.hpp>
#include <string>

using namespace cv;
using namespace std;

std::string get_tegra_pipeline(int width, int height, int fps) {
    return "nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)" + std::to_string(width) + ", height=(int)" +
           std::to_string(height) + ", format=(string)I420, framerate=(fraction)" + std::to_string(fps) +
           "/1 ! nvvidconv flip-method=0 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink";
}

int main() {
    // Options
    int WIDTH = 500;
    int HEIGHT = 500;
    int FPS = 30;

    // Directory name
    string set_dir = "Test";
    // Image base name
    string name_type = "test";
    int count = 0;

    // Define the GStreamer pipeline
    std::string pipeline = get_tegra_pipeline(WIDTH, HEIGHT, FPS);
    std::cout << "Using pipeline: \n\t" << pipeline << "\n";

    // Create OpenCV capture object and make sure it works
    cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);
    if (!cap.isOpened()) {
        std::cout << "Connection failed";
        return -1;
    }

    // View video
    cv::Mat frame;
    while (1) {
        cap >> frame;  // Get a new frame from the camera

        // Display frame
        imshow("Display window", frame);

        // Press esc to take a picture, or hold it down to take a lot!
        if (cv::waitKey(1) % 256 == 27) {
            string string_num = to_string(count);
            cout << "Now saving: " << string_num << endl;
            string save_location = "./" + set_dir + "/" + name_type + "_" + string_num + ".png";
            cout << "Save location: " << save_location << endl;
            imwrite(save_location, frame);
            count++;
        }
        cv::waitKey(1);
    }
}
```